DataRoad - Statistics 2

Why?

Inferential statistics is the toolkit for testing hypotheses about a population using sample data. In practice, it lets you make data-driven decisions backed by statistically significant evidence.

What?

This course extends basic statistical concepts to cover inferential statistics, hypothesis testing with multiple samples, analysis of variance, correlation, and regression analysis. Students will learn to select appropriate statistical tests, analyze various types of data, build and interpret regression models, and understand the statistical principles underlying machine learning techniques.

Curriculum:

Two Sample Tests of Hypothesis

Comparing two populations or samples using t-tests and z-tests, understanding differences between independent and paired samples, and properly interpreting confidence intervals and significance levels.
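
As a rough illustration of why the independent-versus-paired distinction matters, the sketch below computes both test statistics on the same numbers using only the Python standard library. The function names and the data are made up for this example, and turning a statistic into a p-value would still need a t-distribution table or a statistics package.

```python
import math
from statistics import mean, stdev, variance

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom
    for two independent samples (equal variances not assumed)."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def paired_t(before, after):
    """Paired t statistic: a one-sample t-test on per-subject differences."""
    diffs = [y - x for x, y in zip(before, after)]
    t = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
    return t, len(diffs) - 1

# Hypothetical scores for the same five subjects, measured twice.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 16]

t_paired, df_paired = paired_t(before, after)
t_indep, df_indep = welch_t(after, before)
print(f"paired:      t = {t_paired:.2f}, df = {df_paired}")
print(f"independent: t = {t_indep:.2f}, df = {df_indep:.1f}")
```

On this toy data the paired statistic is much larger than the independent one, because pairing removes the subject-to-subject variation from the comparison.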

Analysis of Variance

Using ANOVA to compare means across multiple groups, understanding between-group and within-group variance, applying post-hoc tests, and analyzing factorial designs.
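
The between-group and within-group split can be sketched in a few lines of standard-library Python. This is a minimal one-way example with made-up data and a hypothetical function name; it stops at the F statistic and does not cover post-hoc tests or factorial designs.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic and its degrees of freedom."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, (k - 1, n - k)

# Hypothetical measurements for three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, dfs = one_way_anova_f(groups)
print(f"F = {f_stat:.2f}, df = {dfs}")
```

A large F means the group means differ by more than the spread inside each group would suggest.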

Correlation and Linear Regression

Measuring relationships between variables using correlation coefficients, building simple linear regression models, interpreting slope and intercept, and assessing model fit with the coefficient of determination (R²).

Multiple Regression Analysis

Extending regression to multiple predictor variables, interpreting coefficients, handling multicollinearity, selecting variables, and validating model assumptions.

Nominal Level Hypothesis Tests

Applying chi-square tests to categorical data, including goodness-of-fit tests and tests of independence, and understanding the assumptions and limitations of non-parametric methods.
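
For a test of independence, the statistic compares observed counts with the counts expected if the two variables were unrelated. The sketch below is a minimal standard-library version with a hypothetical function name and a made-up 2x2 table; it returns the statistic and degrees of freedom only, leaving the p-value lookup to a table or a statistics package.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table
    given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical counts: rows are two customer segments, columns are churned / retained.
observed = [[30, 10],
            [20, 40]]
chi2, df = chi_square_independence(observed)
print(f"chi-square = {chi2:.2f}, df = {df}")
```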

Analysis of Ordinal Data

Working with ranked data, applying non-parametric tests such as the Mann-Whitney U, Wilcoxon signed-rank, and Kruskal-Wallis tests, and understanding when to use these alternatives to parametric tests.
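
To show what a rank-based test actually measures, here is a minimal sketch of the Mann-Whitney U statistic using its pairwise-comparison definition, which gives the same value as the rank-sum formula. The function name and the ratings are made up, and the sketch stops at the statistic without computing a p-value.

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the (x, y) pairs with x from a and y from b where x > y,
    with ties counted as half a win. Only the ordering of values is used,
    never their distances, which is why the test suits ordinal data.
    """
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u_b = len(a) * len(b) - u_a
    return u_a, u_b

# Hypothetical satisfaction ratings (1 to 5) from two independent groups.
group_a = [3, 4, 2, 5]
group_b = [1, 2, 3, 1]
u_a, u_b = mann_whitney_u(group_a, group_b)
print(f"U_a = {u_a}, U_b = {u_b}")
```

The two values always sum to the number of pairs, so a very lopsided split suggests one group tends to rank higher than the other.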

Notes

The key is knowing which tool to use, so focus on when each one applies and what question it answers. In practice, you will never be handed a task like "do ANOVA on this dataset, please". Instead, you will be asked something like "which of our techniques had the most influence on customer retention?", and you will need to work out which statistical tool fits and then apply it.
